NetScore: Towards Universal Metrics for Large-scale Performance Analysis of Deep Neural Networks for Practical On-Device Edge Usage
Much of the focus in the design of deep neural networks has been on improving
accuracy, leading to more powerful yet highly complex network architectures
that are difficult to deploy in practical scenarios, particularly on edge
devices such as mobile and other consumer devices given their high
computational and memory requirements. As a result, there has been a recent
interest in the design of quantitative metrics for evaluating deep neural
networks that account for more than model accuracy alone as an indicator
of network performance. In this study, we continue the conversation towards
universal metrics for evaluating the performance of deep neural networks for
practical on-device edge usage. In particular, we propose a new balanced metric
called NetScore, which is designed specifically to provide a quantitative
assessment of the balance between accuracy, computational complexity, and
network architecture complexity of a deep neural network, which is important
for on-device edge operation. In what is one of the largest comparative
analyses of deep neural networks in the literature, the NetScore metric, the
top-1 accuracy metric, and the popular information density metric were compared
across a diverse set of 60 different deep convolutional neural networks for
image classification on the ImageNet Large Scale Visual Recognition Challenge
(ILSVRC 2012) dataset. The evaluation results across these three metrics for
this diverse set of networks are presented in this study to act as a reference
guide for practitioners in the field. The proposed NetScore metric, along with
the other tested metrics, is by no means perfect, but the hope is to push the
conversation towards better universal metrics for evaluating deep neural
networks for use in practical on-device edge scenarios to help guide
practitioners in model design for such scenarios.
Comment: 9 pages
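As a rough illustration of the kind of balance NetScore is described as measuring, the sketch below scores a network from its top-1 accuracy, parameter count, and multiply-accumulate (MAC) count. The abstract does not state the formula. The decibel-style form 20 * log10(a^alpha / (p^beta * m^gamma)), the exponents alpha = 2 and beta = gamma = 0.5, the units (percent, millions of parameters, millions of MACs), and both example networks are assumptions made for illustration and should be checked against the paper. Information density is taken here to be accuracy per million parameters, which is also an assumption.

```python
import math


def netscore(top1_accuracy_pct, params_millions, macs_millions,
             alpha=2.0, beta=0.5, gamma=0.5):
    """Balance score: rewards accuracy, penalizes parameters and MACs.

    Assumed form (not given in the abstract):
        20 * log10(a**alpha / (p**beta * m**gamma))
    """
    if min(top1_accuracy_pct, params_millions, macs_millions) <= 0:
        raise ValueError("all inputs must be positive")
    ratio = (top1_accuracy_pct ** alpha) / (
        (params_millions ** beta) * (macs_millions ** gamma)
    )
    return 20.0 * math.log10(ratio)


def information_density(top1_accuracy_pct, params_millions):
    """Accuracy per million parameters (assumed definition)."""
    return top1_accuracy_pct / params_millions


if __name__ == "__main__":
    # Hypothetical networks, not figures from the study.
    networks = {
        "small-net": (70.0, 4.0, 300.0),     # accuracy %, M params, M MACs
        "large-net": (76.0, 25.0, 4000.0),
    }
    for name, (acc, params, macs) in networks.items():
        print(name,
              "top-1:", acc,
              "info density:", round(information_density(acc, params), 2),
              "netscore:", round(netscore(acc, params, macs), 2))
```

Under these assumed exponents, a compact network with slightly lower accuracy can outrank a larger, more accurate one, which is the kind of trade-off a top-1 ranking alone does not expose.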
Tracing Forum Posts to MOOC Content using Topic Analysis
Massive Open Online Courses are educational programs that are open and
accessible to a large number of people through the internet. To facilitate
learning, MOOC discussion forums exist where students and instructors
communicate questions, answers, and thoughts related to the course.
The primary objective of this paper is to investigate tracing discussion
forum posts back to course lecture videos and readings using topic analysis. We
utilize both unsupervised and supervised variants of Latent Dirichlet
Allocation (LDA) to extract topics from course material and classify forum
posts. We validate our approach on posts bootstrapped from five Coursera
courses and determine that topic models can be used to map student discussion
posts back to the underlying course lecture or reading. Labeled LDA outperforms
unsupervised Hierarchical Dirichlet Process LDA and base LDA for our
traceability task. This research is useful as it provides an automated approach
for clustering student discussions by course material, enabling instructors to
quickly evaluate student misunderstanding of content and clarify materials
accordingly.
Comment: 6 pages, 4 figures, Course project for UofA CMPUT 660, Winter 201
Resolution- and throughput-enhanced spectroscopy using high-throughput computational slit
There exists a fundamental tradeoff between spectral resolution and the
efficiency or throughput for all optical spectrometers. The primary factors
affecting the spectral resolution and throughput of an optical spectrometer are
the size of the entrance aperture and the optical power of the focusing
element. Thus far, jointly optimizing these two factors has proven
difficult. Here, we introduce the concept of high-throughput computational
slits (HTCS), a numerical technique for improving both the effective spectral
resolution and efficiency of a spectrometer. The proposed HTCS approach was
experimentally validated using an optical spectrometer configured with a 200 µm
entrance aperture (test) and a 50 µm entrance aperture (control), demonstrating
an improvement in spectral resolution of ~50% over the control and an
improvement in efficiency of more than 2 times over the largest entrance
aperture used in the study, while producing highly accurate spectra.
Comment: 11 pages, 2 figures
Implications of Computer Vision Driven Assistive Technologies Towards Individuals with Visual Impairment
Computer vision based technology is becoming ubiquitous in society. One
application area that has seen an increase in computer vision is assistive
technologies, specifically for those with visual impairment. Research has shown
the ability of computer vision models to perform tasks such as providing scene
captions, detecting objects, and recognizing faces. Although assisting individuals
with visual impairment with these tasks increases their independence and
autonomy, concerns over bias, privacy and potential usefulness arise. This
paper addresses the positive and negative implications computer vision based
assistive technologies have on individuals with visual impairment, as well as
considerations for computer vision researchers and developers in order to
mitigate these negative implications.
Enabling Computer Vision Driven Assistive Devices for the Visually Impaired via Micro-architecture Design Exploration
Recent improvements in object detection have shown potential to aid in tasks
that previous solutions were unable to accomplish. One such area is
assistive devices for individuals with visual impairment. While
state-of-the-art deep neural networks have been shown to achieve superior
object detection performance, their high computational and memory requirements
make them cost prohibitive for on-device operation. Alternatively, cloud-based
operation leads to privacy concerns. Neither option is attractive to potential users. To
address these challenges, this study investigates creating an efficient object
detection network specifically for OLIV, an AI-powered assistant for object
localization for the visually impaired, via micro-architecture design
exploration. In particular, we formulate the problem of finding an optimal
network micro-architecture as a numerical optimization problem, where we find
the set of hyperparameters controlling the MobileNetV2-SSD network
micro-architecture that maximizes a modified NetScore objective function for
the MSCOCO-OLIV dataset of indoor objects. Experimental results show that such
a micro-architecture design exploration strategy leads to a compact deep neural
network with a balanced trade-off between accuracy, size, and speed, making it
well-suited for enabling on-device computer vision driven assistive devices for
the visually impaired.
COVID-Net: A Tailored Deep Convolutional Neural Network Design for Detection of COVID-19 Cases from Chest X-Ray Images
The COVID-19 pandemic continues to have a devastating effect on the health
and well-being of the global population. A critical step in the fight against
COVID-19 is effective screening of infected patients, with one of the key
screening approaches being radiology examination using chest radiography.
Motivated by this and inspired by the open source efforts of the research
community, in this study we introduce COVID-Net, a deep convolutional neural
network design tailored for the detection of COVID-19 cases from chest X-ray
(CXR) images that is open source and available to the general public. To the
best of the authors' knowledge, COVID-Net is one of the first open source
network designs for COVID-19 detection from CXR images at the time of initial
release. We also introduce COVIDx, an open access benchmark dataset that we
generated comprising 13,975 CXR images across 13,870 patient cases,
with the largest number of publicly available COVID-19 positive cases to the
best of the authors' knowledge. Furthermore, we investigate how COVID-Net makes
predictions using an explainability method in an attempt to not only gain
deeper insights into critical factors associated with COVID cases, which can
aid clinicians in improved screening, but also audit COVID-Net in a responsible
and transparent manner to validate that it is making decisions based on
relevant information from the CXR images. While COVID-Net is by no means a
production-ready solution, the hope is that the open access COVID-Net, along
with the description of how the open source COVIDx dataset was constructed,
will be leveraged and built upon by both researchers and citizen data scientists alike to
accelerate the development of highly accurate yet practical deep learning
solutions for detecting COVID-19 cases and accelerate treatment of those who
need it the most.
Comment: 12 pages
Affine Variational Autoencoders: An Efficient Approach for Improving Generalization and Robustness to Distribution Shift
In this study, we propose the Affine Variational Autoencoder (AVAE), a
variant of Variational Autoencoder (VAE) designed to improve robustness by
overcoming the inability of VAEs to generalize to distributional shifts in the
form of affine perturbations. By optimizing an affine transform to maximize
the evidence lower bound (ELBO), the proposed AVAE transforms an input to the training distribution
without the need to increase model complexity to model the full distribution of
affine transforms. In addition, we introduce a training procedure to create an
efficient model by learning a subset of the training distribution, and using
the AVAE to improve generalization and robustness to distributional shift at
test time. Experiments on affine perturbations demonstrate that the proposed
AVAE significantly improves generalization and robustness to distributional
shift in the form of affine perturbations without an increase in model
complexity.
Comment: 6 pages
Seeing Convolution Through the Eyes of Finite Transformation Semigroup Theory: An Abstract Algebraic Interpretation of Convolutional Neural Networks
Researchers are actively trying to gain better insights into the
representational properties of convolutional neural networks for guiding better
network designs and for interpreting a network's computational nature. Gaining
such insights can be an arduous task due to the number of parameters in a
network and the complexity of a network's architecture. Current approaches to
neural network interpretation include Bayesian probabilistic interpretations
and information theoretic interpretations. In this study, we take a different
approach to studying convolutional neural networks by proposing an abstract
algebraic interpretation using finite transformation semigroup theory.
Specifically, convolutional layers are broken up and mapped to a finite space.
The state space of the proposed finite transformation semigroup is then defined
as a single element within the convolutional layer, with the acting elements
defined by surrounding state elements combined with convolution kernel
elements. Generators of the finite transformation semigroup are defined to
complete the interpretation. We leverage this approach to analyze the basic
properties of the resulting finite transformation semigroup to gain insights on
the representational properties of convolutional neural networks, including
insights into quantized network representation. Such a finite transformation
semigroup interpretation can also enable better understanding outside of the
confines of fixed lattice data structures, making it useful for handling data that
lie on irregular lattices. Furthermore, the proposed abstract algebraic
interpretation is shown to be viable for interpreting convolutional operations
within a variety of convolutional neural network architectures.
Comment: 9 pages
PolyNeuron: Automatic Neuron Discovery via Learned Polyharmonic Spline Activations
Automated deep neural network architecture design has received a significant
amount of recent attention. However, this attention has not been equally shared
by one of the fundamental building blocks of a deep neural network, the
neurons. In this study, we propose PolyNeuron, a novel automatic neuron
discovery approach based on learned polyharmonic spline activations. More
specifically, PolyNeuron revolves around learning polyharmonic splines,
characterized by a set of control points, that represent the activation
functions of the neurons in a deep neural network. A relaxed variant of
PolyNeuron, which we term PolyNeuron-R, loosens the constraints imposed by
PolyNeuron to reduce the computational complexity for discovering the neuron
activation functions in an automated manner. Experiments show both PolyNeuron
and PolyNeuron-R lead to networks that have improved or comparable performance
on multiple network architectures (LeNet-5 and ResNet-20) using different
datasets (MNIST and CIFAR10). As such, automatic neuron discovery approaches
such as PolyNeuron are a worthy direction to explore.
Comment: 5 pages
Auditing ImageNet: Towards a Model-driven Framework for Annotating Demographic Attributes of Large-Scale Image Datasets
The ImageNet dataset ushered in a flood of academic and industry interest in
deep learning for computer vision applications. Despite its significant impact,
there has not been a comprehensive investigation into the demographic
attributes of images contained within the dataset. Such a study could lead to
new insights on inherent biases within ImageNet, particularly important given
it is frequently used to pretrain models for a wide variety of computer vision
tasks. In this work, we introduce a model-driven framework for the automatic
annotation of apparent age and gender attributes in large-scale image datasets.
Using this framework, we conduct the first demographic audit of the 2012
ImageNet Large Scale Visual Recognition Challenge (ILSVRC) subset of ImageNet
and the "person" hierarchical category of ImageNet. We find that 41.62% of
faces in ILSVRC appear as female, 1.71% appear as individuals above the age of
60, and males aged 15 to 29 account for the largest subgroup with 27.11%. We
note that the presented model-driven framework is not fair for all
intersectional groups, so annotations are subject to bias. We present this work
as the starting point for future development of unbiased annotation models and
for the study of downstream effects of imbalances in the demographics of
ImageNet. Code and annotations are available at:
http://bit.ly/ImageNetDemoAudit
Comment: To appear in the Workshop on Fairness Accountability Transparency and
Ethics in Computer Vision (FATE CV) at CVPR 201